Poster: hank_b Date: Apr 7, 2010 10:03pm
Forum: texts Subject: Re: OCR on Fraktur

It's true, ABBYY does offer an add-on module for performing OCR of Fraktur, but it's not presently available to us: ABBYY FineReader comes to us embedded in the LuraTech software we use for PDF compression, and LuraTech has not licensed the Fraktur module. It's not possible for us to add standalone modules to FineReader as embedded in the LuraTech software.

Besides Fraktur, we have the same problem with the "CJK" languages (Chinese, Japanese, Korean) and Hebrew - ABBYY offers modules for each, but those modules are not included in the LuraTech software, and we can't add them separately.

There may be some changes in how we do OCR around the end of 2010, but I'm afraid we won't be able to handle Fraktur at least until then.

Hank Bromley
Internet Archive

Reply

Poster: Mark McKinney Date: Apr 8, 2010 4:00pm
Forum: texts Subject: Re: OCR on Fraktur

Hello Hank and 0mat. Thanks for the note on Fraktur and CJK. Over the last couple of months, as a follow-up to a conversation with Brewster, LuraTech has been implementing support for Fraktur and CJK in our Win32 application. Based on this, there should be a way to extend it to our Linux implementation relatively soon. It's getting closer, but we've been waiting to give you a call with the news until we had firmer dates and details. Drop me an email, Hank, and we can discuss further.

Reply

Poster: nicoremond Date: Oct 9, 2013 6:21am
Forum: texts Subject: Re: OCR on Fraktur

Hello,

I apologize for unearthing this three-year-old post; I've been searching the forum for more recent news, to no avail.

I would very much like to know the present situation with Fraktur texts, as it appears that the texts I've accessed in "Full text" mode were not correctly processed by OCR.

Is there any chance that archive.org will be able to provide OCR versions of its many Fraktur documents?

I am well aware that ABBYY asks a high price for access to its Fraktur technology (http://www.frakturschrift.de/en:pricing). Is there really no way they would collaborate with archive.org? German publications up to the mid-19th century are so incredibly important in so many fields of science and culture that it would be superb progress to be able to search these assets.


Thank you very much in advance for any available information.

Best,

Nicolas

Reply

Poster: Jeff Kaplan Date: Nov 4, 2013 9:37pm
Forum: texts Subject: Re: OCR on Fraktur

We do not OCR Fraktur at this time.

Reply

Poster: t.thorsted Date: Nov 4, 2013 7:39am
Forum: texts Subject: Re: OCR on Fraktur

I am also interested in having Fraktur script available for OCR. I have a few hundred uploads in old German and old Danish.

Has anyone looked into Tesseract? It has a Fraktur language module, and it is all open source.

Reply

Poster: Nemo_bis Date: Nov 6, 2013 1:57am
Forum: texts Subject: Re: OCR on Fraktur

Sadly, there are very few languages for which Tesseract is better than ABBYY (perhaps only Devanagari script and some South Asian languages), so it's understandable that the IA doesn't have the resources to cherry-pick scattered Tesseract modules from around the world and adapt the deriving infrastructure to use two systems instead of one... More info on ABBYY's OCR tools for Fraktur, which have existed since 2005: http://www.abbyy.com/Default.aspx?DN=8db336b5-f145-4280-a45f-6bbed092872f and http://www.frakturschrift.com/en:products
This post was modified by Nemo_bis on 2013-11-06 09:57:04

Reply

Poster: 0mat Date: Jul 20, 2010 3:56am
Forum: texts Subject: Re: OCR on Fraktur

Dear Hank, dear Mark,

Thanks for the explanation; I'm very glad to hear that you are working on this problem. Hopefully you will get support in pursuing this and can provide such OCRed texts.
Is any help needed to tag books that are in Fraktur?

Best,